196 research outputs found

    On the Analysis of Trajectories of Gradient Descent in the Optimization of Deep Neural Networks

    Theoretical analysis of the error landscape of deep neural networks has garnered significant interest in recent years. In this work, we theoretically study the importance of noise in the trajectories of gradient descent towards optimal solutions in multi-layer neural networks. We show that injecting noise (in several different ways) into a neural network during training increases the rank of the product of the weight matrices of a multi-layer linear neural network. We then study how adding noise can assist in reaching a global optimum when the product matrix is full-rank (under certain conditions). We establish theoretical connections between the noise injected into the neural network (into the gradients, the architecture, or the inputs/outputs) and the rank of the product of the weight matrices. We corroborate our theoretical findings with empirical results. Comment: 4 pages + 1 figure (main, excluding references), 5 pages + 4 figures (appendix)
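
    To make the rank claim concrete, the following minimal sketch (our illustration, not the authors' code; the layer sizes, learning rate, and noise scale are arbitrary choices) trains a two-layer linear network from a rank-one initialization. Noiseless gradient descent provably preserves rank(W2 @ W1) = 1 from this start, since every update to W1 stays in span{u} and every update to W2 stays in span{u} on the row side, while Gaussian gradient noise drives the product to full rank:

        import numpy as np

        rng = np.random.default_rng(0)
        d, n = 8, 100                                # layer width, sample count
        X = rng.standard_normal((d, n))
        Y = rng.standard_normal((d, d)) @ X          # full-rank linear teacher

        # Rank-1 initialization with unit-norm factors.
        u = rng.standard_normal((d, 1))
        u /= np.linalg.norm(u)
        v = rng.standard_normal((d, 1))
        v /= np.linalg.norm(v)

        def train(noise, steps=2000, lr=1e-2):
            W1, W2 = u @ v.T, v @ u.T
            for _ in range(steps):
                E = W2 @ W1 @ X - Y                  # residual of the linear net
                g2 = E @ (W1 @ X).T / n              # dL/dW2 for L = ||E||^2 / (2n)
                g1 = W2.T @ E @ X.T / n              # dL/dW1
                if noise:                            # gradient noise injection
                    g1 = g1 + noise * rng.standard_normal(g1.shape)
                    g2 = g2 + noise * rng.standard_normal(g2.shape)
                W1, W2 = W1 - lr * g1, W2 - lr * g2
            return W1, W2

        for s in (0.0, 0.05):
            A, B = train(noise=s)
            print(f"noise std {s}: rank(W2 @ W1) =",
                  np.linalg.matrix_rank(B @ A, tol=1e-6))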

    Boron nitride photocatalysts for solar fuel synthesis

    Reshaping our global energy portfolio in light of rising anthropogenic CO2 emissions is paramount. Solar fuel production via photocatalysis constitutes a sustainable energy generation route, allowing one to harness the abundance of sunlight for CO2 transformation. In this thesis, we develop a new materials platform of boron nitride (BN) photocatalysts for solar fuel synthesis. We present a proof-of-concept for a porous boron oxynitride (BNO) photocatalyst facilitating gas-phase CO2 capture and photoreduction without doping or cocatalysts. We then present two routes to enhance light harvesting and photoactivity in BN: boron doping and oxygen doping. Boron doping yielded B-BNO, the first water-stable, photoactive BN material, facilitating liquid-phase H2 evolution under deep visible irradiation (λ > 550 nm) and gas-phase CO2 photoreduction. In parallel, we demonstrate that tuning the oxygen content in BNO can extend and vary light harvesting into the deep visible region. Using a systematic design-of-experiments process, we tune and predict the chemical, paramagnetic and optoelectronic properties of BNO. We probe the role of free radicals and paramagnetic states in the photochemistry of BNO using a combined experimental, computational and first-principles approach. The BN photocatalysts all exhibit unique paramagnetism, shown to arise from free radicals in isolated OB3 sites, which we unequivocally confirm as the governing state for red-shifted light harvesting and photoactivity in BNO. Finally, we explore a new avenue in BN photocatalyst design and present the first example of semiconducting BNO quantum dots for CO2 photoreduction. The evolution rates, quantum efficiencies and selectivities of all the BN materials surpassed those of P25 TiO2 and graphitic carbon nitride, the benchmark photocatalysts in the field. Overall, this thesis opens the door to a radically new generation of BN-based photocatalysts for solar fuel synthesis. Open Access

    ADINE: An Adaptive Momentum Method for Stochastic Gradient Descent

    Two major momentum-based techniques that have achieved tremendous success in optimization are Polyak's heavy-ball method and Nesterov's accelerated gradient. A crucial step in all momentum-based methods is the choice of the momentum parameter m, which is almost always suggested to be set to less than 1. Although the choice of m < 1 is justified only under very strong theoretical assumptions, it works well in practice even when those assumptions do not necessarily hold. In this paper, we propose a new momentum-based method, ADINE, which relaxes the constraint m < 1 and allows the learning algorithm to use adaptive higher momentum. We motivate our hypothesis on m by experimentally verifying that a higher momentum (≥ 1) can help escape saddle points much faster. Using this motivation, we propose ADINE, which weighs the previous updates more (by setting the momentum parameter > 1). We evaluate our proposed algorithm on deep neural networks and show that ADINE helps the learning algorithm converge much faster without compromising the generalization error. Comment: 8 + 1 pages, 12 figures, accepted at CoDS-COMAD 2018
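
    For reference, the classical heavy-ball update is v ← m·v − η·∇f(w), w ← w + v. The sketch below is our own illustration of using m ≥ 1 to weigh past updates more; the plateau-triggered switching rule is a hypothetical stand-in, not ADINE's actual schedule, and all constants are arbitrary:

        import numpy as np

        def heavy_ball_adaptive(grad, loss, w, lr=0.01, m_low=0.9,
                                m_high=1.05, steps=2000, patience=10, tol=1e-6):
            # Heavy-ball method with a hypothetical adaptive momentum rule:
            # raise m above 1 while the loss plateaus, drop back otherwise.
            v = np.zeros_like(w)
            m, stall, prev = m_low, 0, np.inf
            for _ in range(steps):
                v = m * v - lr * grad(w)      # Polyak's heavy-ball update
                w = w + v
                cur = loss(w)
                stall = stall + 1 if abs(prev - cur) < tol else 0
                prev = cur
                m = m_high if stall >= patience else m_low
            return w

        # Toy usage: a quadratic with one nearly flat direction, where a
        # momentum >= 1 speeds up progress along the plateau.
        A = np.diag([1.0, 1e-3])
        w = heavy_ball_adaptive(lambda w: A @ w,
                                lambda w: 0.5 * w @ A @ w,
                                np.array([1.0, 1.0]))
        print("final iterate:", w)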

    DANTE: Deep AlterNations for Training nEural networks

    We present DANTE, a novel method for training neural networks using the alternating minimization principle. DANTE provides an alternative perspective to the traditional gradient-based backpropagation techniques commonly used to train deep networks. It uses an adaptation of quasi-convexity to cast training a neural network as a bi-quasi-convex optimization problem. We show that, for neural network configurations with both differentiable (e.g. sigmoid) and non-differentiable (e.g. ReLU) activation functions, we can perform the alternations effectively in this formulation. DANTE can also be extended to networks with multiple hidden layers. In experiments on standard datasets, neural networks trained using the proposed method were found to be promising and competitive with traditional backpropagation, both in terms of the quality of the solution and training speed. Comment: 19 pages
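
    To illustrate the alternating-minimization principle (a minimal sketch of our own, not the authors' exact procedure), consider a one-hidden-layer network Y ≈ W2 · sigmoid(W1 · X). Fixing W1 makes the W2 subproblem a linear least-squares solve; fixing W2 leaves a subproblem in W1, approximated here with a few gradient steps. All sizes and step counts are illustrative:

        import numpy as np

        def sigmoid(z):
            return 1.0 / (1.0 + np.exp(-z))

        rng = np.random.default_rng(0)
        d, h, o, n = 5, 16, 3, 200
        X = rng.standard_normal((d, n))
        Y = rng.standard_normal((o, d)) @ np.tanh(X)   # synthetic targets

        W1 = 0.5 * rng.standard_normal((h, d))
        W2 = 0.5 * rng.standard_normal((o, h))

        for it in range(20):                           # outer alternations
            H = sigmoid(W1 @ X)                        # hidden activations
            # Step 1: with W1 fixed, min ||W2 H - Y||^2 is linear least squares.
            W2 = Y @ np.linalg.pinv(H)
            # Step 2: with W2 fixed, a few gradient steps on W1's subproblem.
            for _ in range(25):
                H = sigmoid(W1 @ X)
                E = W2 @ H - Y                         # residual
                G = ((W2.T @ E) * H * (1 - H)) @ X.T / n   # chain rule, sigmoid'
                W1 -= 0.1 * G
            loss = 0.5 * np.mean((W2 @ sigmoid(W1 @ X) - Y) ** 2)
        print(f"final loss: {loss:.4f}")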

    Journalists in New Zealand
